COSTE: Complexity-based OverSampling TEchnique to alleviate the class imbalance problem in software defect prediction



Similar resources

Software defect prediction using a cost sensitive decision forest and voting, and a potential solution to the class imbalance problem

Software development projects inevitably accumulate defects throughout the development process. Because defects can be costly, careful consideration is crucial when predicting which sections of code are likely to contain defects. Classification algorithms from machine learning can be used to create classifiers which can be used [...] sensitive classification methods attempt to make p...


Using Class Imbalance Learning for Cross-Company Defect Prediction

Cross-company defect prediction (CCDP) is a practical approach that trains a prediction model on one or more projects of a source company and then applies the model to a target company. Unfortunately, the performance of such CCDP models is susceptible to the highly imbalanced nature of the defect-prone and non-defect classes of CC data. Class imbalance learning is applied to alleviat...


Data Imbalance Problem solving for SMOTE Based Oversampling: Study on Fault Detection Prediction Model in Semiconductor Manufacturing Process

Fault detection prediction for the FAB (wafer fabrication) process in semiconductor manufacturing can improve product quality and reliability, depending on classification performance. However, faults occur only occasionally in the FAB process, and most outcomes are "pass". Hence, data imbalance arises between the pass and fail classes. If data imbalance occurs, prediction models a...


Stability of Software Defect Prediction in Relation to Levels of Data Imbalance

Software defect prediction is recognized as one of the most important ways to achieve software development efficiency. The majority of the cost of software development is spent on software defect detection activities, but their ability to guarantee software reliability is still limited. The analyses performed by [Andersson and Runeson 2007; Fenton and Ohlsson 2000; Galinac Grbac et al. 2013], in...


The Class Imbalance Problem in Author Identification

Author identification can be seen as a single-label multi-class text categorization problem. Very often, there are extremely few training texts for at least some of the candidate authors, or there is significant variation in text length among the available training texts of the candidate authors. Moreover, in this task usually there is no similarity between the distribution of training and...



Journal

Journal title: Information and Software Technology

Year: 2021

ISSN: 0950-5849

DOI: 10.1016/j.infsof.2020.106432